Improved recognition of native-like protein structures using a combination of sequence-dependent and sequence-independent features of proteins.

Authors

  • K T Simons
  • I Ruczinski
  • C Kooperberg
  • B A Fox
  • C Bystroff
  • D Baker
Abstract

We describe the development of a scoring function based on the decomposition P(structure | sequence) ∝ P(sequence | structure) × P(structure), which outperforms previous scoring functions in correctly identifying native-like protein structures in large ensembles of compact decoys. The first term captures sequence-dependent features of protein structures, such as the burial of hydrophobic residues in the core; the second term captures universal sequence-independent features, such as the assembly of beta-strands into beta-sheets. The efficacies of a wide variety of sequence-dependent and sequence-independent features of protein structures for recognizing native-like structures were systematically evaluated using ensembles of approximately 30,000 compact conformations with fixed secondary structure for each of 17 small protein domains. The best results were obtained using a core scoring function with P(sequence | structure) parameterized similarly to our previous work (Simons et al., J Mol Biol 1997;268:209-225) and P(structure) focused on secondary structure packing preferences. While several additional features had some discriminatory power on their own, they did not provide any additional discriminatory power when combined with the core scoring function. Our results, on both the training set and the independent decoy set of Park and Levitt (J Mol Biol 1996;258:367-392), suggest that this scoring function should contribute to the prediction of tertiary structure from knowledge of sequence and secondary structure.
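
The decomposition stated in the abstract can be illustrated with a minimal sketch. It assumes each decoy already has two log-probability terms available, one for P(sequence | structure) and one for P(structure), and ranks decoys by their sum, which is the logarithm of the unnormalized posterior. The function names, decoy names, and probability values are hypothetical and chosen only for illustration; they are not the parameterization used by the authors.

```python
import math


def log_posterior_score(log_p_seq_given_struct, log_p_struct):
    """Return the unnormalized log P(structure | sequence) for one decoy.

    By Bayes' rule, P(structure | sequence) is proportional to
    P(sequence | structure) * P(structure), so the log terms add.
    """
    return log_p_seq_given_struct + log_p_struct


def rank_decoys(decoys):
    """Sort decoy names from highest to lowest score.

    `decoys` maps a name to a pair
    (log P(sequence | structure), log P(structure)).
    """
    return sorted(
        decoys,
        key=lambda name: log_posterior_score(*decoys[name]),
        reverse=True,
    )


# Toy, made-up probabilities (assumption: not taken from the paper).
toy_decoys = {
    "decoy_a": (math.log(0.020), math.log(0.10)),
    "decoy_b": (math.log(0.050), math.log(0.30)),
    "decoy_c": (math.log(0.040), math.log(0.01)),
}

ranking = rank_decoys(toy_decoys)
print(ranking)
```

Working in log space keeps the score additive, so a sequence-dependent term and a sequence-independent term can be evaluated separately and then combined.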

Similar resources

Heterologous Expression of the Secale cereal Thaumatin-Like Protein in Transgenic Canola Plants Enhances Resistance to Stem Rot Disease

Canola (Brassica napus L.) is an important oilseed crop. A serious problem in the cultivation of this crop, and a cause of yield loss, is the fungal disease stem rot caused by Sclerotinia sclerotiorum. The pathogenesis-related (PR) proteins have the potential for enhancing resistance against fungal pathogens. Thaumatin-like proteins (TLPs) have been shown to have antifungal activity on variou...

Phylogenetic analysis of HSP70 gene of Aspergillus fumigatus reveals conservation intra-species and divergence inter-species

Aspergillus fumigatus is a saprophytic fungus, widely spread in a variety of ecological niches, and the most prevalent of the aspergilli responsible for human and animal invasive aspergillosis. The first step to develop novel and efficient therapies is the identification and understanding of the key tolerance and virulence factors of pathogens. The main focus of the present study is to perform the similarit...

On the transferability of folding and threading potentials and sequence-independent filters for protein folding simulations

Significant progress has recently been made in de novo protein structure prediction. The Rosetta method by Baker and colleagues, which is based on the idea of assembling putative models from a library of k-mer fragments derived from known three-dimensional protein structures, proved to be particularly successful. Critical components of the Rosetta approach are various sequence-dependent as well...

A novel chimeric recombinant protein PDHB-P80 of Mycoplasma agalactiae as a potential diagnostic tool

The aim of this study was to construct and express a novel recombinant chimeric protein consisting of the pyruvate dehydrogenase beta subunit (PDHB) and a highly antigenic region of the integral membrane lipoprotein P80 of Mycoplasma agalactiae as a potential diagnostic tool. The full-length sequence of pdhb and a portion of antigenic regions of P80 were selected and analyzed by CLC ma...

Persian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods

Speech recognition is a subfield of artificial intelligence that develops technologies to convert speech utterance into transcription. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...

Journal:
  • Proteins

Volume 34, Issue 1

Pages: -

Publication date: 1999